All Questions

4 votes
1 answer
54 views

Unsupervised Isolation Forest sklearn hyperparameters

I am using sklearn's IsolationForest for an unsupervised anomaly detection task. According to the docs, https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html, there are ...
Mar
  • 85
3 votes
1 answer
37 views

Confirm understanding of decision_function in Isolation Forest sklearn

I am looking to better understand sklearn IsolationForest decision_function. My understanding is that if the metric is closer to -1 then the model is more confident ...
Mar
  • 85
2 votes
0 answers
36 views

Determine best hyperparameters in GridSearch - Isolation Forest

I have implemented an Isolation Forest algorithm for anomaly detection (unsupervised learning), where I divided my dataset into 1000 subsets, and for each subset, there is one isolation tree. This ...
Learner
4 votes
2 answers
177 views

Loss function in Isolation Forest

I recently came across this algorithm while working on my graduation project. As I understand it, we create a subtree for each subsample. Then we calculate the scores for each ...
Mayank Singh
0 votes
0 answers
81 views

Confused with Isolation Forest

Let's say I have an anomaly detection (unsupervised learning) dataset with 10 observations (two features). The dataset is shown below: After executing the model, the following are the results (anomalies ...
Bits
  • 131
0 votes
1 answer
75 views

detecting abnormality in a specific feature with respect to others (unsupervised?)

I have a large dataset with a feature y which is dependent in part on features x1 and x2. All features are noisy, and y is also dependent on other parameters not captured in the dataset. I would like ...
user18236139
0 votes
0 answers
111 views

Understanding Isolation Forest predictions

I'm running sklearn's IsolationForest on a dataset containing 2 classes of data, one that I know is the anomaly (~1.5% of the entire dataset), the other is the normal dataset. I'm using this (shuffled)...
Rayne
1 vote
1 answer
411 views

regarding computing the centroid of high dimensional data

In scikit-learn, or other Python libraries, are there any existing implementations to compute the centroid of high-dimensional data sets?
user297850
0 votes
2 answers
2k views

Anomaly (Outlier) Detection with Isolation Forest too sensitive even with low contamination

I'm trying to use the sklearn implementation of the Isolation Forest algorithm to detect anomalies in my time series data. However, even with a very low contamination parameter (0.0001), it is ...
NewbierThanANewbie
3 votes
1 answer
283 views

Geolocation Based Anomaly Detection in IPs Using Isolation Forest

I'm trying to detect anomalies based on geolocation from IP addresses in a server access log file. I have created two features, country and geo_velocity, using the IP address and the timestamp of each ...
Nipun Thennakoon
1 vote
2 answers
1k views

Cross-Validation in Anomaly Detection with Labelled Data

I am working on a project where I train two anomaly detection algorithms, Isolation Forest and Auto-Encoder. My data is labelled, so I have the ground truth, but the nature of the problem requires ...
meliksahturker
5 votes
1 answer
3k views

Interpretation of scikit-learn one class svm scores

How can I interpret the scores generated by the function score_samples(X) from a scikit-learn OneClassSVM model? Is there a way ...
ElBrocas
2 votes
2 answers
5k views

What does the classification report interpret? Class 1 indicates abnormal data

How do I interpret the report, and how are the precision and recall values calculated for individual class labels? What is the significance of macro avg? Does this report signify good predictions by the ...
prnai
1 vote
0 answers
116 views

Custom Decision Function for Custom Outlier Detection Algorithm

I have built a custom algorithm for semi-supervised anomaly detection. Here is an example of my output, with the probability threshold set to 0.05 and 1 = outlier, 0 = inlier: ...
Klaudijus
1 vote
1 answer
57 views

How do I evaluate a K-Means unsupervised anomaly detection approach?

How do I evaluate a K-means clustering anomaly detection method when there is no labelled data for the anomaly class? To find the number of clusters (K), I used the silhouette score from the scikit-learn library. Scikit ...
Nite
